pp003 Research Paper Reading Notes

Edited by Ben. Get the knowledge flowing and circulating! :)

Today I'm taking notes on a paper I find very interesting¹.

Keywords: Spatio-temporal data, Few-shot learning, Traffic forecasting, Graph neural network

Overall impression
The paper is very well written!

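Problem introduction

Tension: urban data is sparse vs. machine learning models need large amounts of training data

Solution: cross-city knowledge transfer (exploit the similarity between cities to do transfer learning)

In detail: for urban computing tasks, existing machine learning methods all need a large number of samples to learn an effective model. But since this is urban computing, uneven development across cities has to be taken into account: well-developed cities can collect rich data, while less developed ones have little data and cannot meet the sample-size requirements of these algorithms.

Spatio-temporal graph learning is a key approach. However, the cost of data collection is so high that it is difficult to train a well-performing model for some developing cities, because so little data is available.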

Key idea

Cross-city knowledge transfer has shown its promise.

Specifically: models trained on data-rich cities can benefit model training for data-scarce cities.

Intro

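Much of the current work on urban computing tasks is set in so-called few-shot scenarios.

[6] models cities as grids. It was the first to propose improving spatio-temporal prediction for data-scarce cities. To achieve better region-to-region matching, it introduces large-scale auxiliary data (social media check-in data). This appears to be data with social attributes; it is fairly costly to collect and may also leak private information.
[7] proposes FLORAL to classify air quality, mainly by transferring the semantically related dictionary from a data-rich city. The problem is that transferring knowledge from a single source may cause negative transfer because the two cities differ.
[8] combines a meta-learning method to learn a good initialization model for the target domain from multiple cities. But this work does not consider the varied features across different cities and within the same city.

More importantly, the methods above are only applicable to grid-based data and are not compatible with graph-based models.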

grid-based data
graph-based model

About the graph-based model

The graph-based model has aroused extensive attention recently and achieved great success in spatio-temporal learning of road networks, metro networks, sensor networks, etc.

This paper

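Goal: to transfer cross-city knowledge in graph-based few-shot learning scenarios and, at the same time, explore the impact of knowledge transfer across multiple cities.

Challenges

C1: How to use knowledge from different cities to extract features for the target city (feature extraction).

Current meta-learning methods assume that the transferred knowledge is globally shared across cities, but in reality spatio-temporal characteristics seem to vary a lot even between different regions of the same city. In other words, existing methods cannot handle knowledge transfer in complex scenarios effectively.

C2: How to alleviate the impact of the varied graph structures of different cities on transfer.

Compared with grid-based data, graph-based modeling exposes the different structure information of different cities.
The edges between nodes explicitly describe the interaction between different features.
Existing FSL methods ignore the importance of structure during knowledge transfer, which may lead to unstable results and, worse, increase the risk of structure deviation.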

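→ Challenges

This paper proposes a novel and model-agnostic ST-GFSL framework.

It is the first work to investigate the few-shot scenario in spatio-temporal graph learning.

To adapt to the diversity of different cities, ST-GFSL no longer learns a globally shared model as previous work did. Instead, it generates non-shared model parameters based on node-level meta knowledge to enhance specific feature extraction.

In addition, the paper proposes the ST-Meta Learner, which learns node-level meta knowledge from the local graph structure and time series correlations. During learning, ST-GFSL reconstructs the graph structures of different cities from the meta knowledge. This step defines a graph reconstruction loss that guides structure-aware learning and so avoids structure deviation across multiple source cities.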
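
To make "non-shared parameters generated from node-level meta knowledge" concrete for myself, here is a minimal sketch. It is not the paper's architecture: the linear `generator`, the per-node linear forecaster, and all names and numbers are my own assumptions for illustration. The only idea it shows is that one shared generator can produce different predictor weights for each node.

```python
def generate_node_params(meta_knowledge, generator):
    """Map each node's meta-knowledge vector to its own predictor weights.

    meta_knowledge: one d-dimensional vector per node (hypothetical values).
    generator: T x d matrix shared by all nodes (an assumed linear generator).
    Returns one T-dimensional weight vector per node, i.e. non-shared parameters.
    """
    return [
        [sum(g * z for g, z in zip(row, z_i)) for row in generator]
        for z_i in meta_knowledge
    ]


def predict(history, node_params):
    """Per-node linear forecast: each node's history weighted by its own parameters."""
    return [
        sum(w * x for w, x in zip(w_i, h_i))
        for w_i, h_i in zip(node_params, history)
    ]


# Two nodes with different meta knowledge (d = 2) and a history of T = 3 speeds.
meta_knowledge = [[1.0, 0.0], [0.0, 1.0]]
generator = [[0.5, 0.1], [0.3, 0.2], [0.2, 0.7]]
history = [[60.0, 62.0, 61.0], [30.0, 28.0, 25.0]]

node_params = generate_node_params(meta_knowledge, generator)
forecast = predict(history, node_params)
# node 0 gets weights [0.5, 0.3, 0.2]; node 1 gets [0.1, 0.2, 0.7]
```

The generator is shared, but the weights it emits differ per node, which is how I currently read "non-shared".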

What is node-level knowledge transfer?

Node-level knowledge transfer works through parameter matching, retrieving from nodes with similar spatio-temporal characteristics across the source cities and the target city.

Parsing a hard sentence: The node-level knowledge transfer is realized through parameter matching, retrieving from nodes with similar spatio-temporal characteristics across source cities and target city. Here "with" is read as "having".
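
One way I can picture parameter matching is as a nearest-neighbor lookup. This is only a guess at the mechanism: the cosine similarity, the hard top-1 match, and every name below are my assumptions, not something the paper states in the passages quoted here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def match_parameters(target_meta, source_meta, source_params):
    """For each target node, reuse the parameters of the most similar source node."""
    matched = []
    for z_t in target_meta:
        best = max(
            range(len(source_meta)),
            key=lambda k: cosine_similarity(z_t, source_meta[k]),
        )
        matched.append(source_params[best])
    return matched


# Hypothetical meta knowledge and parameters for two source-city nodes.
source_meta = [[1.0, 0.1], [0.1, 1.0]]
source_params = [[0.5, 0.3, 0.2], [0.1, 0.2, 0.7]]
# Two target-city nodes: the first resembles source node 1, the second source node 0.
target_meta = [[0.2, 0.9], [0.8, 0.0]]

matched = match_parameters(target_meta, source_meta, source_params)
```

Under this reading, what gets retrieved is a set of model parameters, selected by similarity of meta knowledge.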

Summary

Main contributions of the paper:

The first work to explore the few-shot scenario in spatio-temporal graph learning tasks.

Proposes a model-agnostic learning framework called ST-GFSL, which generates non-shared parameters by learning node-level meta knowledge to enhance feature extraction. Node-level knowledge transfer is realized through parameter matching on similar spatio-temporal meta knowledge.

Proposes reconstructing the graph structures of different cities based on meta knowledge. A graph reconstruction loss is further combined to guide structure-aware few-shot learning, which avoids structure deviation across multiple cities.

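Challenges → Responses

Spatio-temporal graphs among different cities show irregular structures and varied features.
- This limits the feasibility of existing Few-Shot Learning (FSL) methods.
This paper proposes a model-agnostic few-shot learning framework for spatio-temporal graph learning.
- Specifically, ST-GFSL generates non-shared parameters based on node-level meta knowledge, enhancing feature extraction through the transfer of cross-city knowledge.
- Nodes in the target city transfer knowledge via parameter matching, retrieving from similar spatio-temporal characteristics. (One question here: what exactly is being retrieved?)
- Further, the paper proposes reconstructing the graph structure during meta-learning.
  - A graph reconstruction loss is defined to guide this structure-aware learning and avoid structure deviation across different datasets.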
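
A sketch of what a graph reconstruction loss could look like. The notes above do not give the paper's exact formula, so I am assuming the common graph-autoencoder form: rebuild the adjacency matrix as sigmoid(z_i · z_j) from node meta knowledge and score it with binary cross-entropy against the real adjacency. All names and numbers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reconstruct_adjacency(meta_knowledge):
    """Predicted edge probability for every node pair: sigmoid of the inner product."""
    n = len(meta_knowledge)
    return [
        [
            sigmoid(sum(a * b for a, b in zip(meta_knowledge[i], meta_knowledge[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


def graph_reconstruction_loss(meta_knowledge, adjacency, eps=1e-12):
    """Mean binary cross-entropy between reconstructed and true adjacency."""
    a_hat = reconstruct_adjacency(meta_knowledge)
    n = len(adjacency)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(a_hat[i][j], eps), 1.0 - eps)  # clamp to keep log finite
            a = adjacency[i][j]
            total -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    return total / (n * n)


# Nodes 0 and 1 are connected; node 2 is isolated (self-loops included).
adjacency = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
aligned = [[2.0, 0.0], [2.0, 0.0], [-2.0, 0.0]]     # agrees with the structure
mismatched = [[2.0, 0.0], [-2.0, 0.0], [2.0, 0.0]]  # contradicts the structure

loss_aligned = graph_reconstruction_loss(aligned, adjacency)
loss_mismatched = graph_reconstruction_loss(mismatched, adjacency)
```

Meta knowledge that agrees with the real graph gives a lower loss, so minimizing this term pushes the learned meta knowledge to carry structure information.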

Takeaway

Aha, here comes something called structure-aware learning! It looks a lot like some of the xxx-aware learning in pp002.

Benchmarks

four traffic speed prediction benchmarks

Application to traffic speed prediction on four public urban datasets (METR-LA, PEMS-BAY, Didi-Chengdu, Didi-Shenzhen).

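Related work

【Key point】Spatio-temporal graph learning is a fundamental and widely studied problem in urban computing tasks.

Early studies approached spatio-temporal graph learning from the perspective of time series analysis and proposed a series of methods: ARIMA, VAR, Kalman filtering [9].

With the rise of deep learning and graph neural networks, graphs have been used to analyze a range of urban problems. (Graph: an effective data structure for describing spatial structural relations.)

[10] proposes STG2Seq, which uses a graph to model multi-step citywide passenger demand.
[2] proposes spatial neighbors and semantic neighbors of road segments to capture the dynamic features of urban traffic flow.
[11] deploys IoT devices on vehicles to sense urban air quality and uses variational graph autoencoders to estimate unknown air pollution.
[12, 13] propose using deep meta learning to generate the learning ability of different regions and improve traffic prediction performance.

But all of the studies above need large-scale training data and do not consider the data-scarce scenario.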

【Key point】Few-shot learning has yielded significant progress in CV and NLP, for example MAML, Prototypical Networks, and Meta-Transfer Learning.

But few-shot learning in non-Euclidean domains (like graph few-shot learning) has not been fully explored.
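
Since MAML keeps coming up, here is a toy sketch of its idea: learn an initialization that adapts well after one gradient step on a small support set. This is my own 1-D example (model y = w * x, squared error), not code from any of the cited papers; the task slopes, learning rates, and function names are assumptions.

```python
def grad(w, xs, ys):
    """Gradient of the mean squared error for the model y = w * x."""
    return 2.0 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)


def hessian(xs):
    """Second derivative of that loss with respect to w (constant in w)."""
    return 2.0 * sum(x * x for x in xs) / len(xs)


def mse(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_task(slope):
    """A task is (support set, query set) drawn from y = slope * x."""
    sx, qx = [1.0, 2.0], [3.0, 4.0]
    return (sx, [slope * x for x in sx]), (qx, [slope * x for x in qx])


def maml_train(tasks, w=0.0, inner_lr=0.1, outer_lr=0.05, steps=200):
    """Learn an initialization w that performs well after one inner step."""
    for _ in range(steps):
        meta_grad = 0.0
        for (sx, sy), (qx, qy) in tasks:
            w_adapted = w - inner_lr * grad(w, sx, sy)  # inner step on the support set
            # Chain rule through the inner step: d(w_adapted)/dw = 1 - inner_lr * hessian.
            meta_grad += grad(w_adapted, qx, qy) * (1.0 - inner_lr * hessian(sx))
        w -= outer_lr * meta_grad / len(tasks)          # outer (meta) update
    return w


# Meta-train on two "source" tasks, then adapt to an unseen task with one step.
w_meta = maml_train([make_task(1.0), make_task(3.0)])
(sx, sy), (qx, qy) = make_task(2.5)
loss_before = mse(w_meta, qx, qy)
w_new = w_meta - 0.1 * grad(w_meta, sx, sy)
loss_after = mse(w_new, qx, qy)
```

The learned initialization lands between the source tasks, and a single step on two support points already lowers the query loss on the new task.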

[17] Few-shot learning on graphs. Meta-GNN is the first to incorporate the meta-learning paradigm (MAML) into node classification in graphs, but it does not fully describe the interrelation between nodes.
[18] assigns the relative and absolute locations of nodes on the graph to further capture the dependencies between nodes.
[19], [20] adopt the idea of prototypical networks and perform few-shot node classification by finding the nearest class prototypes.
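
The "nearest class prototype" idea is simple enough to sketch. This shows only the generic prototypical-network classification step on embeddings that are already computed; it is not the method of [19] or [20], and the embeddings and labels are made up.

```python
import math
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Prototype of each class = mean of its support embeddings."""
    groups = defaultdict(list)
    for emb, label in zip(support_embeddings, support_labels):
        groups[label].append(emb)
    return {
        label: [sum(col) / len(embs) for col in zip(*embs)]
        for label, embs in groups.items()
    }


def classify(query, prototypes):
    """Assign the query to the class whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# A 2-way 2-shot toy episode with hypothetical node embeddings.
support_embeddings = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
support_labels = ["a", "a", "b", "b"]

prototypes = class_prototypes(support_embeddings, support_labels)
pred_1 = classify([1.0, 1.0], prototypes)
pred_2 = classify([4.0, 3.0], prototypes)
```

This only handles classification, which is exactly the limitation for regression-style urban tasks.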

The analysis above shows that current work focuses on few-shot node classification, while many urban computing problems are regression problems. Moreover, compared with general attribute networks, spatio-temporal graphs have more complex and dynamic node characteristics, so directly combining few-shot learning methods with a vanilla GNN model would be infeasible for capturing the complicated node correlations.

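【Key point】Knowledge transfer addresses machine learning problems in data-scarce scenarios.

In urban computing tasks, how to transfer knowledge between cities to reduce data collection cost and improve learning efficiency is an ongoing research problem.

[7] FLORAL is an early work that uses knowledge transfer for air quality classification. Specifically, it transfers knowledge from a city with sufficient multimodal data and labels.
[6] RegionTrans studies cross-city knowledge transfer by partitioning cities into grids for spatio-temporal feature matching.
[8] MetaST proposes transferring knowledge from multiple cities.

However, the work above cannot be applied directly to this paper's problem. Most urban computing problems are regression problems, whereas FLORAL is a classification task. RegionTrans and MetaST are designed for grid-based data, which is not compatible with graph-based modeling. RegionTrans also brings in additional social media check-in data, which reduces its versatility. In addition, FLORAL and RegionTrans only focus on knowledge transfer from a single source city; how to use data from multiple cities and avoid negative transfer is a problem well worth studying.

So, the goal of this paper: learn the cross-city meta knowledge from multiple graph-based datasets and transfer it to a data-scarce city without introducing auxiliary datasets.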

Things I don't understand yet

However, the spatio-temporal graphs among different cities show irregular structures and varied features, which limits the feasibility of existing Few-Shot Learning (FSL) methods.
- What are Few-Shot Learning (FSL) methods?

ST-GFSL proposes to generate non-shared parameters based on node-level meta knowledge.
- What are non-shared parameters?
- What is node-level meta knowledge?

Further knowledge

Which tasks count as urban computing tasks?

traffic flow, taxi demand, air quality forecasting

Good words and sentences to collect

【A one-sentence summary of a paper's experiments, benchmarks, and results】We conduct comprehensive experiments on xxx and the results demonstrate the effectiveness of xxx compared with state-of-the-art methods.

In this section, we briefly introduce the relevant research lines to our work.

Thoughts after reading

Reading good papers is what makes this worthwhile! Only then is the time I put in well spent!

1 Lu B, Gan X, Zhang W, et al. Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer[C]. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022: 1162-1172.